TOPSECRET//COMINT//REL TO USA, AUS, CAN, GBR, NZL 



Advanced HTTP Activity 

Analysis 

2009 






Goal 



The goal of this training is to get you 
familiar with basic HTTP traffic and to 
understand how to target and exploit it 
using X-KEYSCORE 

Agenda 

HTTP stands for Hypertext Transfer 
Protocol, and it’s the primary protocol for 
transferring data on the World Wide Web 






Why are we interested in HTTP? 



[Slide graphic: website logos, including facebook, myspace.com, Earth and Wikipedia] 

Because nearly everything a typical user 
does on the Internet uses HTTP 

Why are we interested in HTTP? 
■ Almost all web-browsing uses HTTP: 

■ Internet surfing 

■ Webmail (Yahoo/Hotmail/Gmail/etc.) 

■ OSN (Facebook/MySpace/etc.) 

■ Internet Searching (Google/Bing/etc.) 

■ Online Mapping (Google Maps/Mapquest/etc.) 

How does HTTP work? 

■ HTTP is comprised of requests from clients to 
servers and their corresponding responses 

■ Many analysts are already familiar with the 
terms “client-to-server” or “server-to-client” 
collection (also referred to as “client side” or 
“server side” collection). 
How does HTTP work? 

■ A “Client” is usually referring to a Browser 
(like Firefox or IE) which is also referred to as 
the “User Agent” 

■ The “Server” can also be referred to as the 
“web-server” or “origin-server” which is the 
machine that is storing the data that is being 
accessed (like a web-page, a map, an inbox, 
etc) 
HTTP Activity 

■ HTTP activity comes in two types: 

[Diagram: Client-to-Server “requests” flow from the Client to Website.com; 
Server-to-Client “responses” flow from Website.com back to the Client] 

HTTP Activity 

■ HTTP activity comes in two types: 

[Diagram: Client and Website.com, with traffic flowing in both directions] 

While there may be a variety of proxies, 
gateways or tunnels in between the client and 
the server, traffic is always going in one direction 
or the other. 

Client vs. Server Side Traffic 

How do you know which side you’re looking 
at? 

Client-to-Server requests are generally small 
in size and are computers talking to other 
computers 

They contain standard HTTP header fields like 
“Host:”, “Accept:”, “Connection:”, etc. 

HTTP Activity Examples 

Client-to-Server request: 

[Screenshot: X-KEYSCORE session viewer, Type: HTTP-GET, DNI Display of the following request] 

GET [path illegible] HTTP/1.1 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.48 Safari/525.19 
Referer: http://www.google.com.pk/search?hl=en&q=[search terms illegible]&btnG=Google+Search&meta= 
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5 
Accept-Encoding: gzip,deflate,bzip2,sdch 
Cookie: [value illegible] 
Accept-Language: [value illegible] 
Accept-Charset: ISO-8859-1,*,utf-8 
Host: www.amazon.com 
Connection: Keep-Alive 

Client vs. Server Side Traffic 

■ Server-to-Client responses are generally 
larger in size and are what web pages look 
like on the Internet. 

■ When you’re at a computer accessing the 
Internet, you’re only seeing Server-to-Client 
traffic. 

HTTP Activity Examples 




[Screenshot: X-KEYSCORE session viewer (Type: HTTP, DNI Display) showing a Press TV web page rendered without its images] 

Kuwait government 'resigns' over economy 
Mon, 16 Mar 2009 19:07:16 GMT 

The Kuwaiti government has submitted its 
resignation to the country's emir amid a row 
over the premier's handling of the economic 
crisis. 

"The resignation has been submitted formally and 
it is up to the emir (ruler) to decide," Reuters 
quoted Nasser al-Duweilah, a parliamentarian, as 
saying on Monday. 

The resignation would further delay the approval of 1.5 billion dinars (USD 5.11 
billion) rescue package which is to be injected to the Persian Gulf nation's 
economy to ease the impact of the global financial crisis. 

The government has not commented on the report. 

Bonus question: Why are the 
images in this web-page missing? 

Server-to-Client Response 

HTTP Activity 

* XKS HTTP Activity Meta-data differs 
greatly depending on which side of traffic 
we’re collecting 

* In nearly all cases it’s better to have 
client-to-server traffic 
[Screenshot: client-to-server session for search.bbc.co.uk, followed by the meta-data X-KEYSCORE extracted from it] 

Accept: */* 
Referer: http://search.bbc.co.uk/search?tab=urdu&order=sortboth&q=musharraf&start=2&scope=urdu 
Accept-Language: en-us 
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) 
Host: search.bbc.co.uk 
Cookie: BBC-UID=[value illegible] 
Cache-Control: max-stale=0 
Connection: Keep-Alive 
X-BlueCoat-Via: 66808702E9A93546 

Host            search.bbc.co.uk 
URL Path        /search 
URL Args        tab=urdu&order=sortboth&q=musharraf&start=3&scope=urdu&link=next 
Search Terms    musharraf 
Language        en 
Browser         Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) 
Via             66808702E9A93546 
Referer         http://search.bbc.co.uk/search?tab=urdu&order=sortboth&q=musharraf&start=2&scope=urdu 
Cookie          BBC-UID=[value illegible] 

HTTP Activity Server-to-Client 







[Screenshot: the same Press TV page ("Kuwait government 'resigns' over economy") viewed as a server-to-client session, followed by its extracted meta-data] 

Application Info    Press TV | Kuwait government 'resigns' over economy 
HTTP Type           response 

HTTP Activity - HTTP Types 

■ Meta-data will also tell you which side of 
traffic you’re looking at 

■ Client-to-server has two main types: 

HTTP Type: get 
HTTP Type: post 

■ Server-to-client has only one: 

HTTP Type: response 

HTTP Activity - Get vs Post 

■ A ‘GET’ is you requesting data from the 
server (most web surfing) 

■ A ‘POST’ is you sending data to the 
server (e.g. signing in, filling out a form, 
composing an e-mail, uploading a file, 
etc.) 
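
The three HTTP Type values can be told apart from the first line of a raw message. The following is a minimal Python sketch of that idea, assuming the message text is available as a string; the helper name `http_type` is hypothetical and is not taken from X-KEYSCORE.

```python
def http_type(first_line: str) -> str:
    """Classify an HTTP message as 'get', 'post', 'response' or 'other'
    from its first line (hypothetical helper, for illustration only)."""
    token = first_line.strip().split(" ", 1)[0].upper()
    if token.startswith("HTTP/"):
        # Responses begin with a status line such as "HTTP/1.1 200 OK".
        return "response"
    if token in ("GET", "POST"):
        # Requests begin with the method name.
        return token.lower()
    return "other"


print(http_type("GET /home.html HTTP/1.1"))  # get
print(http_type("POST /login HTTP/1.1"))     # post
print(http_type("HTTP/1.1 200 OK"))          # response
```

Other request methods (HEAD, PUT and so on) exist; this sketch simply labels them "other" because the slide names only get and post.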
Let’s break down the important 
parts of a client-to-server request 






HTTP Client-to-Server 



GET /home.html 
Host: sample.website.com 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 (USG-25) Firefox/3.0.10 
Accept: image/png,image/*;q=0.8,*/*;q=0.5 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 



First thing to note is the Host: line which tells 
you the name of the server that the client is 
requesting data from 
Host Field 

It’s important to note, that in many cases users think 
they’re at websites like www.yahoo.com , but behind 
the scenes data is coming from a number of 
different servers without the user knowing it: 



GET [path and arguments illegible] HTTP/1.0 
Accept: */* 
Accept-Language: [value illegible] 
Referer: http://us.mc575.mail.yahoo.com/[remainder illegible] 
x-requested-with: XMLHttpRequest 
Accept-Encoding: gzip, deflate 
User-Agent: [start illegible] MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727) 
Host: us.mc575.mail.yahoo.com 
Cookie: [value illegible] 

Bonus question: What would the impact of 
this be on how you formulate your 
X-KEYSCORE queries using the Host 
field? 

HTTP Client-to-Server 



GET /home.html 
Host: sample.website.com 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 (USG-25) Firefox/3.0.10 
Accept: image/png,image/*;q=0.8,*/*;q=0.5 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 

Second, the GET line tells you which file the user is 
requesting from the server. 

If you simply take the path from that line and append it to the 
host name from the Host line, you have the live public URL that 
the user is requesting: 

http://sample.website.com/home.html 
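
As a rough illustration of that step, the Python sketch below joins the Host value and the path from the GET line. It assumes plain HTTP on the default port and a request given as text; `request_url` is a hypothetical helper name, not an X-KEYSCORE function.

```python
def request_url(request_text: str) -> str:
    """Rebuild the requested URL from the request line and the Host header
    (hypothetical helper; assumes plain HTTP on the default port)."""
    lines = [ln.strip() for ln in request_text.strip().splitlines() if ln.strip()]
    # The request line is "METHOD path [HTTP-version]"; the path is the 2nd token.
    path = lines[0].split()[1]
    host = ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()
            break
    return "http://" + host + path


sample = """GET /home.html
Host: sample.website.com
Connection: keep-alive"""
print(request_url(sample))  # http://sample.website.com/home.html
```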
HTTP Client-to-Server 



GET /example.php?region=iraq 
Host: sample.website.com 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 (USG-25) Firefox/3.0.10 
Accept: image/png,image/*;q=0.8,*/*;q=0.5 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 

When the GET line has a ? mark in it, then the GET 
request is also passing information to the server. 

So in this case the client is requesting the file 
example.php, but it’s also passing along a value 
(region=iraq) that could have been entered by the user. 

URL Lines 

When there is a ? mark in the URL line, then X- 
KEYSCORE is breaking it up into two parts. The 
first part is called the URL Path and the second part 
is called the URL Argument. 
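
The same split can be sketched with Python's standard `urllib.parse` module. This is only an illustration of the path/argument division described above, not X-KEYSCORE's own parser; `split_target` is a hypothetical name and the second sample target is invented.

```python
from urllib.parse import parse_qsl, urlsplit


def split_target(target: str):
    """Split a request target into (URL path, raw URL args, parsed args)."""
    parts = urlsplit(target)
    # keep_blank_values=True keeps arguments such as "lr=" that have no value.
    args = parse_qsl(parts.query, keep_blank_values=True)
    return parts.path, parts.query, args


path, raw_args, args = split_target("/example.php?region=iraq")
print(path)      # /example.php
print(raw_args)  # region=iraq
print(args)      # [('region', 'iraq')]

# Several arguments are separated by "&" (hypothetical target).
print(split_target("/search?q=test&start=3&lr=")[2])
# [('q', 'test'), ('start', '3'), ('lr', '')]
```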



URL Path    /search 
URL Args    tab=urdu&order=sortboth&q=musharraf&start=3&scope=urdu&link=next 

Notice all of the “arguments” (each separated by &’s) 
in this URL: 

GET /search?tab=urdu&order=sortboth&q=musharraf&start=3&scope=urdu&link=next HTTP/1.1 
Referer: http://search.bbc.co.uk/search?tab=urdu&order=sortboth&q=musharraf&start=2&scope=urdu 
Accept-Language: en-us 
Host: search.bbc.co.uk 
Cookie: BBC-UID=[value illegible] 
Cache-Control: max-stale=0 
Connection: Keep-Alive 
X-BlueCoat-Via: 66808702E9A93546 

Bonus question: Any idea what the 
information being passed in the 
URL Argument in this example is for? 

HTTP Client-to-Server 



GET /home.html 
Host: sample.website.com 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 (USG-25) Firefox/3.0.10 
Accept: image/png,image/*;q=0.8,*/*;q=0.5 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 

The User-Agent line gives you information on what 
type of client is requesting the data. In this case, 
we can see that it was a Firefox 3.0 browser on a 
Windows NT 5.1 (XP) machine. 

User Agents 

The User Agent (also known as the “browser”) can be 
very valuable. 

While it cannot be trusted to be absolutely unique, in 
many cases you can use it to unwind a proxy or 
multi-user environment. 

It can also provide hints that the request 
originated from a mobile device: 

User-Agent: Mozilla/5.0 (SymbianOS/9.2; U; Series60/3.1 NokiaE63-1/100.21.110; Profile/MIDP-2.0 Configurat[truncated] like Gecko) Safari/413 

User-Agent: NokiaN72/5.0706.4.0.1 Series60/2.8 Profile/MIDP-2.0 Configuration/CLDC-1.1 

User-Agent: iPhone Mail (5H11) 
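
A simple way to act on such hints is a substring check on the User-Agent value. The Python sketch below is an assumption-laden illustration: the hint list is invented from the three examples above and is far from complete, and `looks_mobile` is a hypothetical helper. Because the User-Agent is supplied by the client, a match is a hint and not proof.

```python
# Substrings taken from the example User-Agent strings above (illustrative only).
MOBILE_HINTS = ("symbianos", "series60", "nokia", "iphone", "midp")


def looks_mobile(user_agent: str) -> bool:
    """Return True if the User-Agent contains any of the mobile hint substrings."""
    ua = user_agent.lower()
    return any(hint in ua for hint in MOBILE_HINTS)


print(looks_mobile("NokiaN72/5.0706.4.0.1 Series60/2.8 Profile/MIDP-2.0"))  # True
print(looks_mobile("iPhone Mail (5H11)"))                                   # True
print(looks_mobile("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; "
                   "rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10"))         # False
```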
HTTP Client-to-Server 

GET /home.html 
Host: sample.website.com 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 (USG-25) Firefox/3.0.10 
Accept: image/png,image/*;q=0.8,*/*;q=0.5 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 



The various “Accept” lines instruct the server on the 
types of responses the client can accept back. 
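
Each “Accept” value is a comma-separated list in which an optional q parameter weights the client's preference (1.0 when absent). The Python sketch below shows one way to read such a line; `parse_accept` is a hypothetical helper, and treating a malformed q-value as 0.0 is a choice made for this sketch only.

```python
def parse_accept(value: str):
    """Return (media_type, q) pairs, most preferred first (hypothetical helper)."""
    prefs = []
    for item in value.split(","):
        fields = [f.strip() for f in item.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0  # sketch-only choice for malformed weights
        prefs.append((fields[0], q))
    # sorted() is stable, so entries with equal weight keep their original order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


print(parse_accept("image/png,image/*;q=0.8,*/*;q=0.5"))
# [('image/png', 1.0), ('image/*', 0.8), ('*/*', 0.5)]
```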
Let’s look at a simplified version 
of an HTTP request and response 

What is Web (HTTP) Activity 

This shows how a person logs on to a webpage 

1. Click on http://www.hotmail.com 
   GET Request: from port 3434* (client) to port 80 (server) 

2. “Welcome to Hotmail” 
   HTTP Response: from port 80 (server) to port 3434* (client) 

3. Email Address: me@hotmail.com 
   Password: Admin123 
   POST to the web server: from port 3434* (client) to port 80 (server) 

4. “Welcome to your Inbox/homepage” 
   HTTP Response: from port 80 (server) to port 3434* (client) 

* The client’s port can be any high-numbered port; 3434 is just an example 



HTTP Activity 

Real traffic, however, can be a little more 
complicated. 

Almost all web pages are built from 
multiple files. 

For example, every single image or 
banner ad on a web page is a separate file 
that needs to be individually requested 
before the server that has the file can 
respond 






HTTP Activity - Real World 



Let’s look at the “NSA Today” home page. 

[Screenshot: the NSA Today home page, including a “Current Conditions” weather image, 
banner images and the article “(U//FOUO) NSAG Hosts the CENTCOM Senior Enlisted Leader”] 

HTTP Activity - Real World 

It looks like one page, but each of the 
images and banners is a separate data 
file that your browser pieces back 
together 

HTTP Activity - Real World 

■ In fact, to build the NSA Today home page 
it takes 34 separate files from 4 different 
servers 

■ However, most people probably don’t 
notice, because the entire page loads in 
<300 milliseconds. 

■ If we had a slow internet connection, we’d 
notice the images would initially be 
missing. 
Notice that all of the images are missing. 
They are all separate server-to-client 
responses and therefore completely separate 
“sessions” in X-KEYSCORE or PINWALE 

HTTP Activity - Real World 

■ It’s important to note that not all of the data 
on one web-page came from the same 
server. 

■ For example, most of the NSA Today 
home page comes from home.www.nsa, 
but the image of the current weather 
conditions came from 
wk-admiral208.corp.nsa.ic.gov 
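
To see how one page fans out into separate files on separate servers, the Python sketch below lists the files a page embeds and counts them per host using only the standard library. The HTML snippet and the example.com / example.org host names are invented for illustration; `ResourceCollector` is a hypothetical class.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collect URLs of embedded files: img/script src and link href."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            # Relative references resolve against the page's own URL.
            self.resources.append(urljoin(self.page_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.resources.append(urljoin(self.page_url, attrs["href"]))


# Invented page: two files from the page's own host, one from another host.
page = """<html><head><link rel="stylesheet" href="/style.css"></head>
<body><img src="/banner.jpg">
<img src="http://images.example.org/weather.jpg"></body></html>"""

collector = ResourceCollector("http://www.example.com/")
collector.feed(page)
hosts = Counter(urlsplit(url).netloc for url in collector.resources)
for host, count in hosts.items():
    print(host, count)
```

Each collected URL corresponds to its own GET request and its own response, which is why a single page view appears as many separate sessions.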
HTTP Activity - Real World 

■ This happens all the time on the Internet. 

■ The cnn.com home page may have an ad 
on it that came from the Google ad server, 
and so on. 

■ And this does have an impact on our 
collection! 

This is the traffic path for building the NSA 
Today home page 

[Diagram: the client connects to four servers: corpweb1.nsa, siteworks.nsa, 
wk-admiral208.corp.nsa.ic.gov and home.www.nsa] 

What happens if we only have collection on 
one of the paths? 

[Diagram: the same four servers, with collection on only one of the paths] 

What would that traffic look like? 



GET /current.jpg 
Host: wk-admiral208.corp.nsa.ic.gov 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 (USG-25) Firefox/3.0.10 
Accept: image/png,image/*;q=0.8,*/*;q=0.5 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 
Referer: http://home.www.nsa/ 
If-Modified-Since: Thu, 08 Oct 2009 19:31:56 GMT 
If-None-Match: "d945-16c1-342db643" 
Cache-Control: max-age=0 

What exactly is that telling us? 

■ First off, we know what file they are 
requesting. 

■ They want current.jpg from the 
wk-admiral208.corp.nsa.ic.gov server. 

■ That’s actually a live public URL 
(http://wk-admiral208.corp.nsa.ic.gov/current.jpg) 

■ Do we have any indication why they wanted 
that image? The answer is yes! Look at the referer. 

What exactly is that telling us? 

■ They were referred from http://home.www.nsa/ 

■ The referer is, in essence, telling you what site 
was “linking” to the new site. 

■ Warning! The referer can act in misleading 
ways. 

Referer Field 

■ The referer field is the address of the page 
that linked to the new GET request. 

■ However, this link could have been followed 
automatically, without any action by the user. 

■ I.e. in the case of the current weather image, 
the link was automatic and the user wasn’t 
even aware of the action 

Referer Field 

The referer field could also indicate a user 
action. 

For example, imagine we were on the NSA 
Today webpage and clicked the link to the SID 
Today page. 

What would that traffic look like? 
Referer Field 



GET / 
Host: sidtoday.nsa 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) 
Gecko/2009042316 (USG-25) Firefox/3.0.10 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 
Referer: http://home.www.nsa/ 
Cookie: CFID=565238; CFTOKEN=66534796; 
CFGLOBALS=urltoken%3DCFID%23%3D565238%26CFTOKEN%23%3D665347 
96%26jsessionid%23%3Da830dba3a04b67ae6e351b7463444f72496d%23lastvisit 
%3D%7Bts%20%272009%2D10%2D09%2015%3A38%3A04%27%7D%23timecr 
eated%3D%7Bts%20%272009%2D06%2D19%2010%3A27%3A23%27%7D%23h 
itcount%3D13%23cftoken%3D66534796%23cfid%3D565238%23; 
JSESSIONID=a330dba3a04b67ae6e351b7463444f72496d 

Referer Field 

Now we’re seeing a request go to host 
“sidtoday.nsa” with the referer from 
http://home.www.nsa 

How can we tell from the traffic that the first 
automatic referer we saw for the current 
weather was any different from the user- 
generated referer we saw for the SID Today 
article? 
Cookies! 

Cookies 

Cookies are small pieces of text-based data stored 
on your machine by your web browser. 

Almost all websites have cookies enabled and they 
have a variety of uses, including to help the web-site 
track the activities of their users. 

Most analysts are probably familiar with “machine 
specific cookies” like the Yahoo B cookie 

However cookies are used for a variety of reasons 
What can cookies be used for? 

Cookies can be used to authenticate a user. 

For example, in many cases the “active user” 
for Yahoo web-mail traffic is seen encoded in 
the l= part of the cookie string. 

[Screenshot: decoded Yahoo cookie fields. Legible entries include v=1, an l= value 
with its decoded annotation, lg=us (Language/content: English) and 
intl=us (Country: United States)] 

What can cookies be used for? 

■ Cookies can be used to store information 
about the user that the website is interested in 

■ Look at how the p= value below tells the 
website information about the user of this 
account: 

[Screenshot: decoded Yahoo cookie fields. The p= value is annotated with 
Gender: female, a birth year and a postal code; lg=us (Language/content: English) 
and intl=us (Country: United States) are also shown] 

What can cookies be used for? 

■ Cookies can be used to identify a single 
machine from hundreds of other users on the 
same proxy IP address 

■ The Yahoo B cookie is a “machine specific 
cookie” 

[Screenshot: a Yahoo B cookie value; contents illegible] 

What can cookies be used for? 

Important note: All three of those examples 
are just subsets of the full Yahoo cookie string 
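
To make the idea of “subsets of the full cookie string” concrete, the Python sketch below splits a Cookie header into named cookies and then splits one cookie's value into sub-fields. The header value is invented, and the "&"-separated sub-field layout is an assumption made for this example; both helper names are hypothetical.

```python
def parse_cookie_header(value: str) -> dict:
    """Split a Cookie header value ('a=1; b=2') into a name -> value dict."""
    cookies = {}
    for pair in value.split(";"):
        # Only the first "=" separates name from value; later ones stay in the value.
        name, sep, val = pair.strip().partition("=")
        if sep and name:
            cookies[name] = val
    return cookies


def split_subfields(cookie_value: str) -> dict:
    """Split a value shaped like 'v=1&n=abc' into sub-fields (assumed layout)."""
    fields = {}
    for part in cookie_value.split("&"):
        key, sep, val = part.partition("=")
        if sep:
            fields[key] = val
    return fields


# Invented header: every name and value here is made up for illustration.
header = "Y=v=1&n=abc123&l=xyz&p=m2&lg=us&intl=us; B=d4e5f6&b=3&s=7k"
cookies = parse_cookie_header(header)
print(sorted(cookies))                # ['B', 'Y']
print(split_subfields(cookies["Y"]))
# {'v': '1', 'n': 'abc123', 'l': 'xyz', 'p': 'm2', 'lg': 'us', 'intl': 'us'}
```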
How do we know what each cookie 
value is used for? 

■ Nearly every website uses cookies that, in 
most cases, it designed for its own uses, 
so how do we know what they all mean? 

■ Protocol Exploitation (PE) can examine the traffic to 
try to determine if there is any information 
contained in cookie strings that we might be 
interested in. For example, we’d like to know if 
any part of the cookie acts like a “machine 
specific cookie.” 

How do we know what each cookie 
value is used for? 

However, there are far more cookie options 
out in the wild than PE can possibly examine. 

So even if they aren’t aware of a machine 
specific cookie, it doesn’t mean that it doesn’t 
exist. 

X-KEYSCORE gives you access to the full 
cookie string, so if you’re adventurous enough 
you can do your own protocol exploitation. 






Remember: Cookies are there for a reason! 

Websites put cookies on people’s computers 
for a reason. 

If the data is valuable for a website, it may be 
valuable to us as well. 
How long do cookies live for? 



* Cookies, like any other file on a computer, can 
be deleted by the user. 

* Almost all browsers give you the option to 
view, manage and delete your cookies 



[Screenshot: browser “Cookies” dialog listing the sites that have stored cookies on the 
computer, with Content, Host, Path, Send For and Expires fields for the selected cookie] 

Cookies 



You can see what cookies have been stored on your machine by going into 
the “options” window of your browser and selecting “show cookies” 

[Screenshot: Firefox Options window showing the History, Cookies and Private Data settings] 



Searches 
Searching the Internet 

When a user searches the Internet from one 
of the many web-based search engines 
(Google, Bing, etc.) what does the traffic look 
like? 

Searching the Internet: Client-to-Server 

■ In most cases, the client-to-server traffic is a 
GET request where the search term is 
passed in the URL Arguments: 

GET /search?hl=fr&q=iran&lr= HTTP/1.1 
Host: www.google.com 
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, 
application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, */* 
Cookie: PREF=ID=74f6d7addF51ccd4:U=ccbee9ee665a7dde:TB=2:TM=1255354439:LM=1255433264:S=[value partly illegible]; NID=27=[value partly illegible] 
Accept-Encoding: gzip, deflate 
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) 
Connection: Keep-Alive 
Cache-Control: no-cache 

Searching the Internet: Client-to-Server 

Notice how the URL Path is /search and one 
part of the URL argument is q=iran 

Each website can configure its URLs 
differently, so while with Google the search 
term is contained in the q= part of the URL, a 
different search form might have it as query= 
or search_term= etc. 

http://www.youtube.com/results?search_query=iran&search_type=&aq=o 
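
Because the argument name varies from site to site, one simple approach is to try a list of known names in order. The Python sketch below illustrates that approach with the names mentioned above; the list is illustrative and not exhaustive, and `search_term` is a hypothetical helper rather than X-KEYSCORE's extraction logic.

```python
from urllib.parse import parse_qs, urlsplit

# Argument names mentioned in the text; real sites use many more.
SEARCH_PARAMS = ("q", "query", "search_term", "search_query")


def search_term(url_or_target: str):
    """Return the first search term found in the URL arguments, or None."""
    args = parse_qs(urlsplit(url_or_target).query)
    for name in SEARCH_PARAMS:
        if name in args:
            return args[name][0]
    return None


print(search_term("/search?hl=fr&q=iran&lr="))  # iran
print(search_term("http://www.youtube.com/results?search_query=iran&search_type="))  # iran
print(search_term("/home.html"))  # None
```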
Searching the Internet: Client-to-Server 

X-KEYSCORE tries to account for all the 
variations of search terms contained in the 
URL Argument when it extracts values for the 
“Search Term” column. 

However, there are always other varieties 
out there that we haven’t built hooks for 
yet, so anytime you see something that you 
think should be extracted, please contact the 
team ( ) 

“Referer Searches” 

What happens when a user clicks on a 
search result? 

Let’s start by showing the query itself, in this 
example, we’re going to query the NSANet 
Google for “XKEYSCORE” 
GET /search?q=xkeyscore&btnG=Google 
Host: google4.q.nsa 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 
(USG-25) Firefox/3.0.10 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 

We know from this session that the client is 
requesting the data from the host “google4.q.nsa” and 
we see the search term in the URL Argument 



“Referer Searches” 

What happens when a user clicks on a 
search result? 



[Search result clicked: http://xkeyscore.r1.r.nsa/redmine] 

GET /redmine 
Host: xkeyscore.r1.r.nsa 
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 
(USG-25) Firefox/3.0.10 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
Accept-Language: en-us,en;q=0.5 
Accept-Encoding: gzip,deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Keep-Alive: 300 
Connection: keep-alive 
Cookie: _session_id=ffd87ac8682e3fa0f421b4ffdf9693ae 
Referer: http://google4.q.nsa/search?q=xkeyscore&btnG=Google+Search 

First, we can determine the full URL 
by adding the GET line to the host: 
http://xkeyscore.r1.r.nsa/redmine 

“Referer Searches” 

* Secondly, we get some hints as to why the 
user was requesting that page from the 
Referer line: 



Referer: ht1p://google4.q,nsa/search?q=xkeyscore&btr)G=GciCigle+Searoh 



* Note that it was the same URL that we were 
at immediately before we clicked the “result” 
link 
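As a rough sketch of the two observations above, the following Python rebuilds the full URL from the Host header and GET path, then reads the search term back out of the Referer header. The helper name `referer_search_terms` and the assumption that the query lives in a parameter named `q` are illustrative choices for this example only.

```python
from urllib.parse import urlsplit, parse_qs


def referer_search_terms(headers: dict) -> list:
    """Return the q values found in the Referer URL, if any."""
    referer = headers.get("Referer", "")
    query = urlsplit(referer).query
    # parse_qs decodes + as a space, so multi-word searches come back readable.
    return parse_qs(query).get("q", [])


headers = {
    "Host": "xkeyscore.r1.r.nsa",
    "Referer": "http://google4.q.nsa/search?q=xkeyscore&btnG=Google+Search",
}
full_url = "http://" + headers["Host"] + "/redmine"
terms = referer_search_terms(headers)
print(full_url, terms)
```

A request with no Referer header simply yields an empty list, which mirrors the point that the header does not have to be present.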
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Let’s look at that process again 



xkeyscore.r1.r.nsa



First, a client-to-server request is
sent that contains
the query on
“xkeyscore”
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Referer Searches 



Let’s look at that process again 



xkeyscore.r1.r.nsa



Second, the server
responds back with
the search results
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“Referer Searches” 




Let’s look at that process again: 




google4.q.nsa

xkeyscore.r1.r.nsa



Third, by clicking on one of 
the results, a new GET 
request is issued to retrieve 
the XKEYSCORE home 
page. In this request, the 
location of the original 
search is listed as the 
“referer” 
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What will happen if we 
only have collection on 
this link? 



xkeyscore.r1.r.nsa
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“Referer Searches” 

When XKEYSCORE sees a search
contained in the “referer” field, we still extract
it as meta-data into the “search terms” field,
but we tag it with (referer) to denote
where it was originally found:



Search Terms: (referer)the legal status of the caspian sea
HTTP Type: get
Host: www.parstimes.com
URL Path: /law/caspian_status.html
URL Args:
Referer: http://www.google.com/search?hl=fa&source=hp&q=the+legal+status+of+the+caspian+sea&lr=
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“Referer Searches” 



GET /law/caspian_status.html HTTP/1.1
Accept: */*
Host: www.parstimes.com
Referer: http://www.google.com/search?hl=fa&source=hp&q=the+legal+status+of+the+caspian+sea&lr=
Accept-Language: fa
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; InfoPath.2)
Cache-Control: max-stale=0
Connection: close
X-BlueCoat-Via: 0ASF5353GF3F63EE



Can we guess what happened here? 
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ID: sessjJiiLjjir'CiL: 



Another example 



GET /FT pirV? e;11 p. K-T r rrori cn - , iKUHi-PaJinrr-Farik/-dp/l 8606^8^12 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.48 Safari/525.19
Referer: http://www.google.com.pk/search?hl=en&q=m- b tt.e n "b oaks on iiizb oil ali&btnG=Google Search&meta=
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: gzip,deflate,bzip2,sdch
Cookie: ubid-main=1 g||552 58 1 6- £765531; apn-user-id=Rl YTF7QF JRTTYQ5
Accept-Language: cn-T]K,an
Accept-Charset: ISO-8859-1,*,utf-8
Host: www.amazon.com
Connection: Keep-Alive
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Proxy Information 
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Proxy Information 

In a lot of cases we're going to see HTTP 
Activity from behind a proxy or proxies. 

What is a proxy? 

> A proxy is a server that is acting as an 
intermediary for HTTP requests from clients 

Why do proxies exist?

• Performance: Proxy can cache responses for static pages

• Censorship: Proxy can filter traffic

• Security: Proxy can look for malware

• Access-Control: Proxy can control access to restricted content
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Proxy Information 

Routinely, we’re going to see ISP level 
proxies. 

That is, instead of having each individual 
user request web pages directly from the 
web servers, the ISP is going to collect all of 
those requests first, and then proxy them out 
through a handful of proxy IP addresses. 

When the response is returned, the proxy
passes it on to the appropriate user
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Proxy Information 

Why would the ISP want to proxy traffic? 

In many cases the ISP won’t have to supply
public IP addresses to all its users

It can simply give them a private IP address, 
and then use a handful of public IP 
addresses for its proxies which are the 
machines actually requesting the traffic from 
the web-servers 







Proxies on the Internet 




Direct-Connect 



Mixed-Gateway 



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User-to-Proxy 



= Web-Server - Web-Servers 



Short-lived connections 
Single-user 



Short-lived connections 
Multiple-users multiplexed 



The Internet 



= Web-Servers 

Long-lived connections 
Multiple-users multiplexed 



National-Level Proxy 



Cache 
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Identifying a Proxy 

* How do you know that the IP address that 
you think is your target is really a proxy? 

■ First step, check NKB. 

* They have services that attempt to
automatically detect proxies

* These services are in no way 100% accurate, so this is only the first step
in checking to see if the IP Address is a proxy
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Identifying a Proxy: NKB 



Query: IP Address
Date: 2009-10-27

Description: Value (Confidence) (Classification)

IP Range: (TS//SI//REL TO USA, FVEY)
Lat/Long (precision): (none found) (TS//SI//REL TO USA, FVEY)
City: ZAHEDAN (20) (TS//SI//REL TO USA, FVEY)
Country: IR (IRAN) (91) (TS//SI//REL TO USA, FVEY)
Provider:
IP owner: RAVANE FARA2 JRaWSHAHR COMPANY INTERNET SERVICE PROVIDER (82) (U//FOUO)
Autonomous System Route Prefix: (80) (U//FOUO)
Autonomous System Number: 12880 (95) (U//FOUO)
Autonomous System Name: DCI-AS DCI Autonomous System (95) (U//FOUO)
FQDN: (none found)
Domain: (30) (U//FOUO)
Service: PROXY (U//FOUO)
Service: TRANSPARENTPROXY (U//FOUO)
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Identifying a Proxy 

Other things to be on the look out for: 
X-Forwarded-For IP Address 

■ What is it?

■ An X-Forwarded-For IP address is the proxy
passing on to the server what it thinks is the IP
address of the user

> Think of it as the proxy telling the server “this is
who I think this request came from”

■ It’s important to note that multiple proxies can be,
and often are, present, so one proxy might just
be reporting the IP address of another proxy
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Identifying a Proxy 

* X-Forwarded-For IP Address as seen in 
traffic: 



GET / HTTP/1.0
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: www.ebay.com
Pragma: no-cache
Via: 1.0 s , i ozio cl: net. c am (squid/ d . U .STABLE10)
Cache-Control: max-age=259200
Connection: keep-alive
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Some Examples of X-Forwarded-For headers: 




X-Forwarded-For: 



X-Forwarded-For: 



X-Forwarded-For: 



X-Forwarded-For: 



X-Forwarded-For: 




X-Forwarded-For: 192.168.110, 10.0.0.22,



X-Forwarded-For: 



X-Forwarded-For: 127.0.0.1



X-Forwarded-For: 



|google.com : 



Multiple-Layers of Proxies! 

In general, the first IP is the one closest to the original requestor
Keep in mind: these can be totally fake
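A small sketch of reading an X-Forwarded-For value may make the multiple-layers point concrete. It splits the comma-separated list in order (first entry nearest the original requestor) and labels each entry using Python's standard `ipaddress` module. The function name `parse_xff` and the label strings are assumptions for this example, and since the header can be falsified, the labels describe only what the header claims.

```python
import ipaddress


def parse_xff(value: str) -> list:
    """Split an X-Forwarded-For value into (address, label) pairs, in header order."""
    hops = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas
        try:
            ip = ipaddress.ip_address(token)
        except ValueError:
            hops.append((token, "invalid"))
            continue
        if ip.is_loopback:
            label = "loopback"
        elif ip.is_private:
            label = "private"
        else:
            label = "public"
        hops.append((token, label))
    return hops


# 8.8.8.8 is only a stand-in for some public address.
chain = parse_xff("192.168.1.10, 10.0.0.22, 8.8.8.8")
print(chain)
```

Private or loopback entries at the front of the list are a hint that the address reported by a proxy is not one that can be routed on the public Internet.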
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Identifying a Proxy 



* Similar to the X-Forwarded-For tag is the
“VIA tag”

• The VIA tag is the proxy identifying itself




GET / HTTP/1.0
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: www.ebay.com
Via: STABLE10)
X-Forwarded-For: 217.21 “ 135
Cache-Control: max-age=259200
Connection: keep-alive
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Identifying a Proxy 

* The Via: tag may even contain some good 
information about the proxy 

* Be careful though because this information 
could be falsified: 



Via: 1.0 tehran-proxy-srv:3128 (squid/2.5.STABLE1)
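To show what information a Via line can carry, here is a minimal Python sketch that splits each entry into the protocol version, the proxy's self-reported name and port, and the optional comment in parentheses (often the proxy software). The name `parse_via` and the dictionary keys are assumptions for this example, and the comma split is deliberately naive: it would mis-split a comment that itself contains a comma.

```python
import re

# protocol version, then the proxy's name (optionally with :port), then an optional (comment)
VIA_ENTRY = re.compile(
    r"^\s*(?P<protocol>\S+)\s+(?P<received_by>[^\s(]+)"
    r"(?:\s+\((?P<comment>[^)]*)\))?\s*$"
)


def parse_via(value: str) -> list:
    """Parse a Via header value into one dict per proxy hop."""
    entries = []
    for part in value.split(","):
        match = VIA_ENTRY.match(part)
        if match:
            entries.append(match.groupdict())
    return entries


hops = parse_via("1.0 tehran-proxy-srv:3128 (squid/2.5.STABLE1)")
print(hops)
```

As with X-Forwarded-For, every field here is whatever the proxy chose to report, so it is a lead rather than proof.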
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Identifying a Proxy 

Remember though that the X-Forwarded-For
and VIA lines can be falsified
and don’t have to be present!

If they’re not present, how can you tell the IP 
address is a proxy? 

Test it in MARINA! 
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Testing IP Addresses in MARINA 

• The primary side effect of a proxy is too 
many users online at the same time 

■ So if all else fails, try querying on the IP
address (assuming it’s USSID18 compliant, of
course!) in MARINA to see how many users
were active within an hour time frame

* It’s not scientific but generally it will help 
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Testing IP Addresses in MARINA 

For example look at these results: 




There were 274 unique “Active Users” in that
hour. Think it’s a proxy?
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Hi IP Header Fingerprint (HHFP) 



DERIVED FROM: NSA/CSSM 1-52 



TOP SECRET//COMINT//ORCON//REL TO USA, AUS, CAN, GBR, NZL



What is the HHFP? 



GCHQ created the HHFP to help identify 
individual users behind a single proxy IP 
address 

The HHFP is a hash of multiple header fields 
that can be used to identify a single user 
behind a proxy 
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What is the HHFP? 

At least one of these values must be present: 

■ X-Forwarded-For IP Address 

- Via 

- Client IP address 

If so, the HHFP is a hash of those values 
combined with the User Agent string 
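The idea of hashing a few header fields into one short value can be sketched as follows. The slides do not say which hash function, field order, or separator is used, so everything here is an assumption: CRC32 is chosen only because it yields an 8-hex-digit value, and `header_fingerprint` and `client_ip` are hypothetical names.

```python
import zlib


def header_fingerprint(headers: dict, client_ip: str = None):
    """Return an 8-hex-digit fingerprint, or None if no proxy-related field is present."""
    xff = headers.get("X-Forwarded-For")
    via = headers.get("Via")
    # At least one of the three values must be present, per the slide.
    if not (xff or via or client_ip):
        return None
    material = "|".join(
        [xff or "", via or "", client_ip or "", headers.get("User-Agent", "")]
    )
    # CRC32 is an illustrative stand-in, not the hash the slides refer to.
    return format(zlib.crc32(material.encode("utf-8")), "08x")


base = {
    "X-Forwarded-For": "10.0.0.22",
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)",
}
fp_a = header_fingerprint(base)
fp_b = header_fingerprint(dict(base))
fp_c = header_fingerprint({**base, "User-Agent": base["User-Agent"].replace("MSIE 6.0", "MSIE 7.0")})
print(fp_a, fp_b, fp_c)
```

Two users behind the same proxy with the same forwarded address and the same browser would produce the same value, which is why a fingerprint of this kind cannot guarantee a single unique user.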
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What is the HHFP? 

EX: Here’s an Iranian proxy IP
address that has multiple HHFPs
underneath it.

NOTE: There’s no guarantee that
an HHFP identifies a single
unique user. It’s entirely possible
that more than one user will have
the same HHFP
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32 ) c% 
[Screenshot: about 30 distinct HHFP values listed under the proxy IP address, each seen once (3%)]









Pros and Cons of HHFP

On the positive side, the HHFP is a single 8-digit
value which can help identify a single user behind a
proxy

On the negative side, it requires an XFF IP
address, Via string or Client IP Address, and since
many sessions contain none of the three, they’ll have
no HHFP string

Also, even with the HHFP, all of the fields that are
used to build it are available in the XKS HTTP
Activity query, so it’s not providing you with any
data you don’t already have access to
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XKS’s HTTP Activity Search 



DERIVED FROM: NSA/CSSM 1-52 
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XKS HTTP Activity Search 



After that overview of how HTTP Activity 
works, let’s look into how to effectively 
target it through XKS queries 
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XKS HTTP Activity Search 

* HTTP Activity indexes every HTTP 
session 

■ Client-to-server and server-to-client

■ Can be queried on any of the unique 
HTTP meta-data fields or any of the 
“standard” DNI fields (IP Address, SIGAD, 
CASENOTATION etc). 
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f 



HTTP Type:
Host:
URL Path:
URL Args:
Search Term:
Language:
Active User:
TDI Type:
TDI:
Character Encoding:
Content Start:
Content Stop:
Content Total:



XKS HTTP Activity Search 

■ Unique Meta-data fields of this search 
include: 



Fields already covered in this training: 



Referer:
X-Forwarded-For:
Via:
Proxy Hash (HHFP):
Cookie:
Browser:
Attachment Filename:
Farrar i ypa:
Geo Info [ f ilrR-'t )
Mi sc Info Bultoitl;
Links of Interest
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XKS HTTP Activity Search 



■ In addition to all of the common fields like: 
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XKS HTTP Activity Search 

• Most commonly, HTTP Activity queries
in XKS will be used to enable
“persona analysis”

* Based on MARINA, TRAFFICTHIEF or
PINWALE, we’ll want to query XKS to
discover all of the HTTP Activity that
occurred around the target’s session of
interest
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Simple HTTP Searches 

In order to do a “persona analysis” type
search, all we’ll need to fill in is the IP of
the target (assuming it’s USSID18
compliant) and a short time range “around”
the time of the activity:
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XKS HTTP Activity Search 



Another common query comes from analysts who
want to see all traffic from a given IP
address (or IP addresses) to a specific
website.
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XKS HTTP Activity Search 

* For example, let’s say we want to see all
traffic from IP Address 1.2.3.4 to the
website www.website.com

* While we can just put the IP address and 
the “host” into the search form, remember 
what we saw before about the various host 
names for a given website 
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Host Field 

It’s important to note, that in many cases users think 
they’re at websites like www.yahoo.com , but behind 
the scenes data is coming from a number of 
different servers without the user knowing it: 



GET /:']] c fmc d'liie s/ur/ab C oat acts 7mcn^b=jdID ba? jsn & israjid—i El] 3 / EO 7 S: fand=2 1 Z7G 5 345 9 HTTP/ 1 . .] 






C: L^t-Iiii^Lii^y=; fa 

http 'Jim [Jju 51 5 irjaiJ yahoo . C o : rJr n^i't: ji c wF o 1 d ti = _ylu— X7 oDT-iI.P.ijl: rilicb GH QB F STA id '( 5 GiDMv J 'MT 
A yWwR hT^TJkZWsNc2dz7m; d= I 2 1 S57_AEP. ksEIAA Nvj Si6wITQ7fflZ a *T&fi<i-Tnbo ?;&? prt=date&e 
rder=up £startZ£i-l=3 r> &dlterE7= 

X-Requested-With: XMLHttpRequest

Accept-Encoding: gzip, deflate

ifajiyin fan e.o ; iwiiki™ nt 5.l m .net clh 2 0 . so? 27] 




Vi— i vA XI Fv :i Yn 7 nn ri J';-:v/:v,R -AT i- 2| UKZT <wiivy < iK StimkGD ZV 7 a! hE 9 5 dL a Z 5 C Qx 1 i-iT)li;T ::ari f> vpi 

ad?XvBO eaj5I<r 1 

I.r=t 
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XKS HTTP Activity Search 

In order to account for all of the possible 
host names, we must front-wildcard the 
host name. 

* Be careful when front-wildcarding 
because beyond being resource intensive 
for XKS, it can be dangerous from a 
USSID18 perspective 
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Hints for wildcarding a host name 



If you’re trying to query for traffic to the 
website www.website.com the best way to 
wildcard it is: 

*.website.com 

Notice that the . before the hostname
website is still there. That way we will
properly hit on ads.website.com and
images.website.com but avoid false
hits on www.anotherwebsite.com
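The effect of keeping the dot in the wildcard can be checked with a short Python sketch. It uses the standard library's shell-style matcher as a stand-in for a front-wildcard comparison; the helper name `host_matches` is an assumption, and how a real query system evaluates wildcards may differ.

```python
from fnmatch import fnmatchcase


def host_matches(pattern: str, host: str) -> bool:
    """Case-insensitive shell-style wildcard match of a host name."""
    return fnmatchcase(host.lower(), pattern.lower())


for candidate in ("ads.website.com", "images.website.com", "www.anotherwebsite.com"):
    # Compare the pattern with the dot against the pattern without it.
    print(candidate, host_matches("*.website.com", candidate), host_matches("*website.com", candidate))
```

One side effect worth noting: under this kind of matching, `*.website.com` does not hit the bare name `website.com`, because the pattern requires at least the dot before it.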
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Hints for wildcardang a host name 




Why are we only interested in traffic 
coming from our IP of interest going to 
our website of interest? 
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forum 



showthread.php
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Helpful GUI Shortcuts 

Earlier we talked about how XKS broke a 
GET request into the URL Path and URL 
Argument (separated by a ?) 

Ex: http://forum



showthread.php?t=131485



Gets broken out to
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for Lim 



Host: forum

URL Path: /showthread.php

URL Args: t=131485
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Helpful GUI Shortcuts 

So if we were to query for this URL we 
would need to enter those fields in 
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Helpful GUI Shortcuts 

■ Or we could use the “URL Field Builder” to 
simply copy and paste the full URL and let 
XKS break it into its appropriate parts: 
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Helpful GUI Shortcuts 



URL Field Builder

Enter a URL that will be automatically parsed to populate the host,
path, and argument fields:

http://forum

showthread.php?t=131485

Host: forum

URL Path: /showthread.php

URL Args: t=131485
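The split that the URL Field Builder performs can be approximated with Python's standard URL parser. This is only a sketch of the host / path / argument breakdown described above; the function name `build_url_fields` is hypothetical, and `forum.example.com` is a placeholder host because the real host name is not legible in the slide.

```python
from urllib.parse import urlsplit


def build_url_fields(url: str) -> dict:
    """Break a full URL into host, path, and argument fields (split at the ?)."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname or "",
        "path": parts.path,
        "args": parts.query,
    }


# forum.example.com is a placeholder, not the host from the slide.
fields = build_url_fields("http://forum.example.com/showthread.php?t=131485")
print(fields)
```

Everything before the `?` after the host becomes the URL Path, and everything after it becomes the URL Args, matching the breakdown shown in the GUI.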
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